On Mean Shift Clustering for Directional Data on a Hypersphere
Authors
Abstract
The mean shift clustering algorithm is a useful tool for clustering numeric data. Recently, Chang-Chien et al. [1] proposed a mean shift clustering algorithm for circular data, that is, directional data on a plane. In this paper, we extend mean shift clustering to directional data on a hypersphere. Three types of mean shift procedures are considered. With the proposed mean shift clustering for data on a hypersphere, the number of clusters need not be specified in advance, since the method automatically finds a final cluster number with good cluster centers. Several numerical examples are used to demonstrate the effectiveness and superiority of the proposed method.
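To make the idea concrete, the following is a minimal sketch of one possible mean shift iteration on the unit hypersphere. It is not the authors' procedure: the abstract does not specify the three procedures it considers, so the exponential (von Mises-Fisher style) kernel, the concentration parameter `kappa`, the merge threshold `merge_cos`, and the function name `spherical_mean_shift` are all assumptions chosen for illustration.

```python
import numpy as np

def spherical_mean_shift(X, kappa=20.0, tol=1e-8, max_iter=500, merge_cos=0.99):
    """Illustrative mean shift on the unit hypersphere (assumed kernel and parameters)."""
    X = np.asarray(X, dtype=float)
    # Project the data onto the unit hypersphere.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    modes = X.copy()  # every data point starts as its own candidate mode
    for _ in range(max_iter):
        # Assumed kernel: weight grows with cosine similarity; kappa acts as an inverse bandwidth.
        W = np.exp(kappa * (modes @ X.T - 1.0))
        # Weighted mean of the data, renormalized so it stays on the sphere.
        shifted = W @ X
        shifted /= np.linalg.norm(shifted, axis=1, keepdims=True)
        converged = np.max(np.linalg.norm(shifted - modes, axis=1)) < tol
        modes = shifted
        if converged:
            break
    # Merge modes that point in nearly the same direction; the cluster count is not given in advance.
    centers = []
    labels = np.empty(len(X), dtype=int)
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if m @ c >= merge_cos:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return np.array(centers), labels
```

In this sketch the number of clusters is simply the number of distinct directions the points converge to, which mirrors the abstract's claim that the cluster number does not have to be supplied. The result depends strongly on `kappa`, which plays the role of the bandwidth in ordinary mean shift.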
Similar resources
Clustering on the Unit Hypersphere using von Mises-Fisher Distributions
Several large scale data mining applications, such as text categorization and gene expression analysis, involve high-dimensional data that is also inherently directional in nature. Often such data is L2 normalized so that it lies on the surface of a unit hypersphere. Popular models such as (mixtures of) multi-variate Gaussians are inadequate for characterizing such data. This paper proposes a g...
Mixture of Watson Distributions: A Generative Model for Hyperspherical Embeddings
Machine learning applications often involve data that can be analyzed as unit vectors on a d-dimensional hypersphere, or equivalently are directional in nature. Spectral clustering techniques generate embeddings that constitute an example of directional data and can result in different shapes on a hypersphere (depending on the original structure). Other examples of directional data include text...
On mean shift-based clustering for circular data
Cluster analysis is a useful tool for data analysis. Clustering methods are used to partition a data set into clusters such that the data points in the same cluster are the most similar to each other and the data points in the different clusters are the most dissimilar. The mean shift was originally used as a kernel-type weighted mean procedure that had been proposed as a clustering algorithm. ...
Hypersphere Sampling for Accelerating High-Dimension and Low-Failure Probability Circuit-Yield Analysis
This paper proposes a novel and efficient method termed hypersphere sampling to estimate the circuit yield of low-failure probability with a large number of variable sources. Importance sampling using a mean-shift Gaussian mixture distribution as an alternative distribution is used for yield estimation. Further, the proposed method is used to determine the shift locations of the Gaussian dis...
Clustering of Variables Based On a Probabilistic Approach Defined on the Hypersphere
We consider n individuals described by p standardized variables, represented by points on the surface of the unit hypersphere S^{n-1}. For a previous choice of n individuals we suppose that the set of observable variables comes from a mixture of bipolar Watson distributions defined on the hypersphere. EM and Dynamic Clusters algorithms are used for identification of such a mixture. We obtain estimat...